An efficient method to estimate pronunciation from multiple utterances

Authors

  • Tofigh Naghibi
  • Sarah Hoffmann
  • Beat Pfister
Abstract

Given K utterances of a word and a set of sub-word units, one may need a generalization of the conventional one-dimensional Viterbi algorithm to decode the utterances jointly and derive their underlying word model (pronunciation). This extension is called the K-dimensional Viterbi algorithm. However, its complexity grows exponentially with the number of utterances, causing a prohibitive computational burden. Here, we propose an approximation algorithm for the K-dimensional Viterbi algorithm that uses the available utterances efficiently to estimate the pronunciation. In addition to automatic dictionary generation, it can be used in computationally expensive applications such as lexicon-free training and joint pattern alignment.
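For orientation, the conventional one-dimensional Viterbi algorithm that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the authors' approximation algorithm: the function names, the log-probability table layout, and the helper `joint_trellis_cells` are all assumptions made for this sketch. The helper assumes a joint trellis indexed by one frame position per utterance plus a state, which is one way to see why the cost grows exponentially with the number of utterances.

```python
import math


def viterbi(log_init, log_trans, log_emit):
    """Most likely state path for a single utterance (one-dimensional Viterbi).

    log_init[s]     : log-probability of starting in state s
    log_trans[r][s] : log-probability of moving from state r to state s
    log_emit[t][s]  : log-likelihood of frame t under state s
    (This table layout is an assumption of the sketch.)
    """
    n_frames, n_states = len(log_emit), len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, n_frames):
        prev, score, ptr = score, [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at frame t.
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, max(score)


def joint_trellis_cells(frame_counts, n_states):
    """Cells in a hypothetical joint trellis: product of frame counts times states."""
    return math.prod(frame_counts) * n_states
```

Under these assumptions, a single utterance of 50 frames with 40 states gives 2,000 cells, while three such utterances decoded jointly give 5,000,000 cells, which suggests why an approximation becomes attractive as K grows.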

Similar resources

Breadth-first search for finding the optimal phonetic transcription from multiple utterances

Extending the vocabulary of a large vocabulary speech recognition system usually requires phonetic transcriptions for all words to be known. With automatic phonetic baseform determination, acoustic samples of the words in question can substitute for the required expert knowledge. In this paper we follow a probabilistic approach to this problem and present a novel breadth-first search algorithm...

Italian speakers learn lexical stress of German morphologically complex words

Italian speakers tend to stress the second component of German morphologically complex words such as compounds and prefix verbs even if the first component is lexically stressed. To improve their prosodic phrasing, an automatic pronunciation teaching method was developed based on auditory feedback of prosodically corrected utterances in the learners’ own voices. Basically, the method copies cont...

Considerations on vowel durations for Japanese CALL system

Due to various difficulties in pronunciation, utterances by nonnative speakers may be lacking in fluency. The Japanese pronunciation is said to have mora-synchronism, and, therefore, we assume that the disfluency may cause larger variations in vowel durations. Analyses of vowel (and CV) durations were conducted for Japanese sentence utterances by 2 non-Japanese speakers and one Japanese speaker...

Feature-based Pronunciation Modeling for Speech Recognition

We present an approach to pronunciation modeling in which the evolution of multiple linguistic feature streams is explicitly represented. This differs from phone-based models in that pronunciation variation is viewed as the result of feature asynchrony and changes in feature values, rather than phone substitutions, insertions, and deletions. We have implemented a flexible feature-based pronunci...

Prediction of American listeners’ misrecognition of English words spoken by Japanese

This study tries to automatically estimate the probability of individual spoken words of Japanese English (JE) being perceived correctly by American listeners and to clarify what kind of combinations of segmental, prosodic, and/or linguistic errors are more fatal to the correct recognition. Firstly, from a large speech database of JE, a balanced set of 360 utterances of 90 male speakers were se...

Publication date: 2013